QVAC-12239 feat[api]: add qvac doctor command and system-requirements doc#1681
Merged
simon-iribarren merged 15 commits intoApr 28, 2026
Conversation
91f333c to
d0a5699
Compare
…(QVAC-12239) Validates host preconditions (Node version, platform/arch, RAM, free disk, ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users get actionable hints instead of opaque crashes. Ships with human and --json output, unit tests, bats smoke tests, and a published system-requirements.md.
Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor) and aligns with the pitch's naming so we can grow host + installed-model checks under a single command without a follow-up breaking change.
…2239) Addresses correctness gaps found in the initial doctor implementation. - Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on desktops only; mobile platforms are SDK *deploy* targets, not CLI hosts. Surface that distinction in labels, hints, and docs. - Add a dedicated "Deploy targets (SDK)" section that reports the full SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into Desktop (informational/always pass via bare-pack cross-bundling), Android (checks `adb --version`), and iOS (checks `xcodebuild -version` on macOS hosts; informational on non-macOS hosts). - Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the RAM check doesn't produce noisy false warnings on Linux/macOS where "free" under-reports reclaimable page cache. - Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the disk-space check actually runs on Node 18.0-18.14 hosts instead of silently skipping. - Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve` so hoisted installs (monorepos, Yarn/Bun workspaces) are found correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single source of truth. - Introduce `info` CheckStatus + `informational` CheckSeverity for the iOS-on-non-darwin case, so it's clearly neither a warning nor a failure. - Cover every new code path in unit tests (new hoisted-SDK test, all target-check permutations, info status handling).
39edd0c to
f22ffd1
Compare
NamelsKing
reviewed
Apr 23, 2026
probeBinary uses spawnSync with no timeout, which means a misbehaving binary on PATH (e.g. adb waiting for a device, an interactive --version that stalls on TTY detection) can hang `qvac doctor` indefinitely. Set timeout: 3000ms on the spawnSync call. 3s is generous for a --version-style invocation while keeping the command snappy. Treat SIGTERM explicitly as a failed probe so the tool is reported as "not found" rather than silently hanging. Addresses PR tetherto#1681 review comment from @NamelsKing.
The fail branch (< 2 GB) reported severity 'required' while the warn and pass branches reported 'recommended'. Severity is meant to describe the check itself, not the outcome of a specific branch — flipping it makes report consumers that pivot by severity (e.g. "list all required checks that passed") produce wrong results. Total RAM has a hard gate (<2 GB fails the report) with a recommended band on top, so the check is 'required' across all branches — the same pattern checkNodeVersion already uses. Also add a unit test locking this invariant across fail/warn/pass so the inconsistency cannot silently come back. Addresses PR tetherto#1681 review comment from @NamelsKing.
…rm Check interface (QVAC-12239)
Previously all checks lived in a single 493-line src/doctor/checks.ts
with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe),
checkDesktopTargets(platform, arch), …). Adding a new check meant
inventing yet another signature and remembering to wire it into
collectCheckSections, and tests had to know which knobs each check
accepted.
Introduce a single abstraction:
type Check = (ctx: CheckContext) => CheckResult
where CheckContext carries everything a check can read (projectRoot,
platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes,
probe). createDefaultContext() reads the live host once; tests build a
context with exactly the overrides they need and get deterministic,
host-independent runs.
Files split out by report section:
src/doctor/check.ts -- CheckContext, Check, probeBinary, createDefaultContext
src/doctor/checks/runtime.ts -- node version, CLI host
src/doctor/checks/hardware.ts -- total/available RAM, free disk
src/doctor/checks/targets.ts -- desktop, android, ios
src/doctor/checks/tools.ts -- ffmpeg, bare, bun
src/doctor/checks/project.ts -- @qvac/sdk resolvable
src/doctor/checks/index.ts -- collectCheckSections, isReportOk, barrel re-exports
The orchestrator in src/doctor/index.ts is unchanged apart from the
import path. Existing behaviour and JSON shape are identical; this is
purely structural.
Tests are reworked to go through the new interface via a small makeCtx()
helper. Two new tests cover collectCheckSections({ context }) override
and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass.
Addresses PR tetherto#1681 review comment from @lauripiisang.
QVAC inference backends use Metal on macOS (ggml-metal is hard-linked
in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the
whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device
enumeration). Running LLM or Whisper inference without a GPU backend
falls back to CPU, which is roughly an order of magnitude slower.
`qvac doctor` previously gave users no visibility into whether GPU
acceleration was actually wired up on their host, so they could spin
up inference, see slow CPU-only throughput, and have no way to diagnose
why.
Add a platform-aware GPU check under the Hardware section:
- darwin -> pass, "Metal (native macOS backend)"
(Metal is always present on supported macOS versions)
- linux -> probe `vulkaninfo --summary`
-> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080")
-> warn with apt/dnf install hint when the ICD is missing
- win32 -> probe `vulkaninfo --summary`
-> pass + deviceName list
-> warn with Vulkan SDK / GPU-driver hint when missing
- other -> info, "not checked on <platform>"
Severity is 'recommended' — missing GPU acceleration does not fail the
report (inference still runs on CPU), but a warning is surfaced with a
concrete remediation link so users can fix throughput proactively.
Implementation notes:
- ProbeResult gains an optional `stdout` field so probes can return
full multi-line output. The GPU check parses `deviceName = ...`
lines to surface which GPUs the Vulkan loader sees.
- The 3s probeBinary timeout applies automatically — a stuck
vulkaninfo call will not hang `qvac doctor`.
- 6 new unit tests cover darwin/linux/win32/unknown-platform paths,
device-name extraction, and the no-device fallback.
Docs (README + system-requirements.md) updated.
Addresses PR tetherto#1681 review comment from @NamelsKing.
lauripiisang
previously approved these changes
Apr 23, 2026
Victor-Rodzko
previously approved these changes
Apr 24, 2026
# Conflicts: # packages/cli/package.json
f425bd3
NamelsKing
approved these changes
Apr 28, 2026
lauripiisang
approved these changes
Apr 28, 2026
Contributor
Author
|
/review |
Contributor
Tier-based Approval Status |
6 tasks
simon-iribarren
added a commit
that referenced
this pull request
Apr 28, 2026
…pe (#1775) PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+ embed() return shape — it changed from raw `Promise<number[] | number[][]>` to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed` destructures `{ embedding }` from the result, then the OpenAI embeddings route reads it as `embeddings[0]`. That PR landed the runtime change but did not bump the `@qvac/sdk` semver in `packages/cli/package.json`. It still requires `^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs @qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI then destructures `{ embedding }` from a number array, gets `undefined`, and the route handler crashes on `embeddings[0]`. The error is caught, the response becomes a 500 with `{ error: { code: 'embed_error' } }`, and the e2e tests asserting `.object == "list"` fail with no obvious hint as to why: not ok 40 embeddings: single input returns vector (e2e.bats:184) not ok 41 embeddings: batch input returns multiple vectors (e2e.bats:198) These two tests have been red on every CLI PR's CI since #1596 merged (visible on PR #1681, #1766, and others); chat and transcription tests are unaffected because their SDK contracts didn't change. Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install time, and bump the runtime `MIN_SDK_VERSION` constant to match. The SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available). Verified locally: rm -rf node_modules bun.lock && bun install resolves @qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests: ok 6 embeddings: single input returns vector ok 7 embeddings: batch input returns multiple vectors
Contributor
Author
|
/review |
GustavoA1604
pushed a commit
that referenced
this pull request
Apr 29, 2026
…pe (#1775) PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+ embed() return shape — it changed from raw `Promise<number[] | number[][]>` to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed` destructures `{ embedding }` from the result, then the OpenAI embeddings route reads it as `embeddings[0]`. That PR landed the runtime change but did not bump the `@qvac/sdk` semver in `packages/cli/package.json`. It still requires `^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs @qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI then destructures `{ embedding }` from a number array, gets `undefined`, and the route handler crashes on `embeddings[0]`. The error is caught, the response becomes a 500 with `{ error: { code: 'embed_error' } }`, and the e2e tests asserting `.object == "list"` fail with no obvious hint as to why: not ok 40 embeddings: single input returns vector (e2e.bats:184) not ok 41 embeddings: batch input returns multiple vectors (e2e.bats:198) These two tests have been red on every CLI PR's CI since #1596 merged (visible on PR #1681, #1766, and others); chat and transcription tests are unaffected because their SDK contracts didn't change. Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install time, and bump the runtime `MIN_SDK_VERSION` constant to match. The SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available). Verified locally: rm -rf node_modules bun.lock && bun install resolves @qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests: ok 6 embeddings: single input returns vector ok 7 embeddings: batch input returns multiple vectors
GustavoA1604
pushed a commit
that referenced
this pull request
Apr 29, 2026
… doc (#1681) * feat[api]: add qvac check-system command and system-requirements doc (QVAC-12239) Validates host preconditions (Node version, platform/arch, RAM, free disk, ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users get actionable hints instead of opaque crashes. Ships with human and --json output, unit tests, bats smoke tests, and a published system-requirements.md. * mod: rename qvac check-system to qvac doctor (QVAC-12239) Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor) and aligns with the pitch's naming so we can grow host + installed-model checks under a single command without a follow-up breaking change. * mod: expand qvac doctor checks for correctness and completion (QVAC-12239) Addresses correctness gaps found in the initial doctor implementation. - Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on desktops only; mobile platforms are SDK *deploy* targets, not CLI hosts. Surface that distinction in labels, hints, and docs. - Add a dedicated "Deploy targets (SDK)" section that reports the full SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into Desktop (informational/always pass via bare-pack cross-bundling), Android (checks `adb --version`), and iOS (checks `xcodebuild -version` on macOS hosts; informational on non-macOS hosts). - Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the RAM check doesn't produce noisy false warnings on Linux/macOS where "free" under-reports reclaimable page cache. - Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the disk-space check actually runs on Node 18.0-18.14 hosts instead of silently skipping. - Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve` so hoisted installs (monorepos, Yarn/Bun workspaces) are found correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single source of truth. - Introduce `info` CheckStatus + `informational` CheckSeverity for the iOS-on-non-darwin case, so it's clearly neither a warning nor a failure. - Cover every new code path in unit tests (new hoisted-SDK test, all target-check permutations, info status handling). * fix: cap doctor probeBinary at 3s to prevent hangs (QVAC-12239) probeBinary uses spawnSync with no timeout, which means a misbehaving binary on PATH (e.g. adb waiting for a device, an interactive --version that stalls on TTY detection) can hang `qvac doctor` indefinitely. Set timeout: 3000ms on the spawnSync call. 3s is generous for a --version-style invocation while keeping the command snappy. Treat SIGTERM explicitly as a failed probe so the tool is reported as "not found" rather than silently hanging. Addresses PR #1681 review comment from @NamelsKing. * mod: normalize checkTotalMemory severity to 'required' (QVAC-12239) The fail branch (< 2 GB) reported severity 'required' while the warn and pass branches reported 'recommended'. Severity is meant to describe the check itself, not the outcome of a specific branch — flipping it makes report consumers that pivot by severity (e.g. "list all required checks that passed") produce wrong results. Total RAM has a hard gate (<2 GB fails the report) with a recommended band on top, so the check is 'required' across all branches — the same pattern checkNodeVersion already uses. Also add a unit test locking this invariant across fail/warn/pass so the inconsistency cannot silently come back. Addresses PR #1681 review comment from @NamelsKing. * refactor: split doctor checks into per-section modules behind a uniform Check interface (QVAC-12239) Previously all checks lived in a single 493-line src/doctor/checks.ts with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe), checkDesktopTargets(platform, arch), …). Adding a new check meant inventing yet another signature and remembering to wire it into collectCheckSections, and tests had to know which knobs each check accepted. Introduce a single abstraction: type Check = (ctx: CheckContext) => CheckResult where CheckContext carries everything a check can read (projectRoot, platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes, probe). createDefaultContext() reads the live host once; tests build a context with exactly the overrides they need and get deterministic, host-independent runs. Files split out by report section: src/doctor/check.ts -- CheckContext, Check, probeBinary, createDefaultContext src/doctor/checks/runtime.ts -- node version, CLI host src/doctor/checks/hardware.ts -- total/available RAM, free disk src/doctor/checks/targets.ts -- desktop, android, ios src/doctor/checks/tools.ts -- ffmpeg, bare, bun src/doctor/checks/project.ts -- @qvac/sdk resolvable src/doctor/checks/index.ts -- collectCheckSections, isReportOk, barrel re-exports The orchestrator in src/doctor/index.ts is unchanged apart from the import path. Existing behaviour and JSON shape are identical; this is purely structural. Tests are reworked to go through the new interface via a small makeCtx() helper. Two new tests cover collectCheckSections({ context }) override and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass. Addresses PR #1681 review comment from @lauripiisang. * feat: add GPU acceleration check to qvac doctor (QVAC-12239) QVAC inference backends use Metal on macOS (ggml-metal is hard-linked in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device enumeration). Running LLM or Whisper inference without a GPU backend falls back to CPU, which is roughly an order of magnitude slower. `qvac doctor` previously gave users no visibility into whether GPU acceleration was actually wired up on their host, so they could spin up inference, see slow CPU-only throughput, and have no way to diagnose why. Add a platform-aware GPU check under the Hardware section: - darwin -> pass, "Metal (native macOS backend)" (Metal is always present on supported macOS versions) - linux -> probe `vulkaninfo --summary` -> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080") -> warn with apt/dnf install hint when the ICD is missing - win32 -> probe `vulkaninfo --summary` -> pass + deviceName list -> warn with Vulkan SDK / GPU-driver hint when missing - other -> info, "not checked on <platform>" Severity is 'recommended' — missing GPU acceleration does not fail the report (inference still runs on CPU), but a warning is surfaced with a concrete remediation link so users can fix throughput proactively. Implementation notes: - ProbeResult gains an optional `stdout` field so probes can return full multi-line output. The GPU check parses `deviceName = ...` lines to surface which GPUs the Vulkan loader sees. - The 3s probeBinary timeout applies automatically — a stuck vulkaninfo call will not hang `qvac doctor`. - 6 new unit tests cover darwin/linux/win32/unknown-platform paths, device-name extraction, and the no-device fallback. Docs (README + system-requirements.md) updated. Addresses PR #1681 review comment from @NamelsKing.
Proletter
pushed a commit
that referenced
this pull request
May 24, 2026
…pe (#1775) PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+ embed() return shape — it changed from raw `Promise<number[] | number[][]>` to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed` destructures `{ embedding }` from the result, then the OpenAI embeddings route reads it as `embeddings[0]`. That PR landed the runtime change but did not bump the `@qvac/sdk` semver in `packages/cli/package.json`. It still requires `^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs @qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI then destructures `{ embedding }` from a number array, gets `undefined`, and the route handler crashes on `embeddings[0]`. The error is caught, the response becomes a 500 with `{ error: { code: 'embed_error' } }`, and the e2e tests asserting `.object == "list"` fail with no obvious hint as to why: not ok 40 embeddings: single input returns vector (e2e.bats:184) not ok 41 embeddings: batch input returns multiple vectors (e2e.bats:198) These two tests have been red on every CLI PR's CI since #1596 merged (visible on PR #1681, #1766, and others); chat and transcription tests are unaffected because their SDK contracts didn't change. Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install time, and bump the runtime `MIN_SDK_VERSION` constant to match. The SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available). Verified locally: rm -rf node_modules bun.lock && bun install resolves @qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests: ok 6 embeddings: single input returns vector ok 7 embeddings: batch input returns multiple vectors
Proletter
pushed a commit
that referenced
this pull request
May 24, 2026
… doc (#1681) * feat[api]: add qvac check-system command and system-requirements doc (QVAC-12239) Validates host preconditions (Node version, platform/arch, RAM, free disk, ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users get actionable hints instead of opaque crashes. Ships with human and --json output, unit tests, bats smoke tests, and a published system-requirements.md. * mod: rename qvac check-system to qvac doctor (QVAC-12239) Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor) and aligns with the pitch's naming so we can grow host + installed-model checks under a single command without a follow-up breaking change. * mod: expand qvac doctor checks for correctness and completion (QVAC-12239) Addresses correctness gaps found in the initial doctor implementation. - Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on desktops only; mobile platforms are SDK *deploy* targets, not CLI hosts. Surface that distinction in labels, hints, and docs. - Add a dedicated "Deploy targets (SDK)" section that reports the full SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into Desktop (informational/always pass via bare-pack cross-bundling), Android (checks `adb --version`), and iOS (checks `xcodebuild -version` on macOS hosts; informational on non-macOS hosts). - Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the RAM check doesn't produce noisy false warnings on Linux/macOS where "free" under-reports reclaimable page cache. - Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the disk-space check actually runs on Node 18.0-18.14 hosts instead of silently skipping. - Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve` so hoisted installs (monorepos, Yarn/Bun workspaces) are found correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single source of truth. - Introduce `info` CheckStatus + `informational` CheckSeverity for the iOS-on-non-darwin case, so it's clearly neither a warning nor a failure. - Cover every new code path in unit tests (new hoisted-SDK test, all target-check permutations, info status handling). * fix: cap doctor probeBinary at 3s to prevent hangs (QVAC-12239) probeBinary uses spawnSync with no timeout, which means a misbehaving binary on PATH (e.g. adb waiting for a device, an interactive --version that stalls on TTY detection) can hang `qvac doctor` indefinitely. Set timeout: 3000ms on the spawnSync call. 3s is generous for a --version-style invocation while keeping the command snappy. Treat SIGTERM explicitly as a failed probe so the tool is reported as "not found" rather than silently hanging. Addresses PR #1681 review comment from @NamelsKing. * mod: normalize checkTotalMemory severity to 'required' (QVAC-12239) The fail branch (< 2 GB) reported severity 'required' while the warn and pass branches reported 'recommended'. Severity is meant to describe the check itself, not the outcome of a specific branch — flipping it makes report consumers that pivot by severity (e.g. "list all required checks that passed") produce wrong results. Total RAM has a hard gate (<2 GB fails the report) with a recommended band on top, so the check is 'required' across all branches — the same pattern checkNodeVersion already uses. Also add a unit test locking this invariant across fail/warn/pass so the inconsistency cannot silently come back. Addresses PR #1681 review comment from @NamelsKing. * refactor: split doctor checks into per-section modules behind a uniform Check interface (QVAC-12239) Previously all checks lived in a single 493-line src/doctor/checks.ts with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe), checkDesktopTargets(platform, arch), …). Adding a new check meant inventing yet another signature and remembering to wire it into collectCheckSections, and tests had to know which knobs each check accepted. Introduce a single abstraction: type Check = (ctx: CheckContext) => CheckResult where CheckContext carries everything a check can read (projectRoot, platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes, probe). createDefaultContext() reads the live host once; tests build a context with exactly the overrides they need and get deterministic, host-independent runs. Files split out by report section: src/doctor/check.ts -- CheckContext, Check, probeBinary, createDefaultContext src/doctor/checks/runtime.ts -- node version, CLI host src/doctor/checks/hardware.ts -- total/available RAM, free disk src/doctor/checks/targets.ts -- desktop, android, ios src/doctor/checks/tools.ts -- ffmpeg, bare, bun src/doctor/checks/project.ts -- @qvac/sdk resolvable src/doctor/checks/index.ts -- collectCheckSections, isReportOk, barrel re-exports The orchestrator in src/doctor/index.ts is unchanged apart from the import path. Existing behaviour and JSON shape are identical; this is purely structural. Tests are reworked to go through the new interface via a small makeCtx() helper. Two new tests cover collectCheckSections({ context }) override and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass. Addresses PR #1681 review comment from @lauripiisang. * feat: add GPU acceleration check to qvac doctor (QVAC-12239) QVAC inference backends use Metal on macOS (ggml-metal is hard-linked in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device enumeration). Running LLM or Whisper inference without a GPU backend falls back to CPU, which is roughly an order of magnitude slower. `qvac doctor` previously gave users no visibility into whether GPU acceleration was actually wired up on their host, so they could spin up inference, see slow CPU-only throughput, and have no way to diagnose why. Add a platform-aware GPU check under the Hardware section: - darwin -> pass, "Metal (native macOS backend)" (Metal is always present on supported macOS versions) - linux -> probe `vulkaninfo --summary` -> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080") -> warn with apt/dnf install hint when the ICD is missing - win32 -> probe `vulkaninfo --summary` -> pass + deviceName list -> warn with Vulkan SDK / GPU-driver hint when missing - other -> info, "not checked on <platform>" Severity is 'recommended' — missing GPU acceleration does not fail the report (inference still runs on CPU), but a warning is surfaced with a concrete remediation link so users can fix throughput proactively. Implementation notes: - ProbeResult gains an optional `stdout` field so probes can return full multi-line output. The GPU check parses `deviceName = ...` lines to surface which GPUs the Vulkan loader sees. - The 3s probeBinary timeout applies automatically — a stuck vulkaninfo call will not hang `qvac doctor`. - 6 new unit tests cover darwin/linux/win32/unknown-platform paths, device-name extraction, and the no-device fallback. Docs (README + system-requirements.md) updated. Addresses PR #1681 review comment from @NamelsKing.
Proletter
pushed a commit
that referenced
this pull request
May 24, 2026
…pe (#1775) PR #1596 (Apr 15) updated the CLI's `sdkEmbed` for the new SDK 0.9+ embed() return shape — it changed from raw `Promise<number[] | number[][]>` to wrapped `Promise<{ embedding, stats? }>`. The CLI's `sdkEmbed` destructures `{ embedding }` from the result, then the OpenAI embeddings route reads it as `embeddings[0]`. That PR landed the runtime change but did not bump the `@qvac/sdk` semver in `packages/cli/package.json`. It still requires `^0.8.0`, which in npm semver means `>=0.8.0 <0.9.0`. CI installs @qvac/sdk@0.8.0, whose `embed()` returns the raw vector array. The CLI then destructures `{ embedding }` from a number array, gets `undefined`, and the route handler crashes on `embeddings[0]`. The error is caught, the response becomes a 500 with `{ error: { code: 'embed_error' } }`, and the e2e tests asserting `.object == "list"` fail with no obvious hint as to why: not ok 40 embeddings: single input returns vector (e2e.bats:184) not ok 41 embeddings: batch input returns multiple vectors (e2e.bats:198) These two tests have been red on every CLI PR's CI since #1596 merged (visible on PR #1681, #1766, and others); chat and transcription tests are unaffected because their SDK contracts didn't change. Bump `@qvac/sdk` to `^0.9.0` so the lockfile picks up 0.9.x at install time, and bump the runtime `MIN_SDK_VERSION` constant to match. The SDK 0.9 series is published on npm (0.9.0 and 0.9.1 both available). Verified locally: rm -rf node_modules bun.lock && bun install resolves @qvac/sdk to 0.9.1; full e2e suite now passes the embeddings tests: ok 6 embeddings: single input returns vector ok 7 embeddings: batch input returns multiple vectors
Proletter
pushed a commit
that referenced
this pull request
May 24, 2026
… doc (#1681) * feat[api]: add qvac check-system command and system-requirements doc (QVAC-12239) Validates host preconditions (Node version, platform/arch, RAM, free disk, ffmpeg/bare/bun presence, @qvac/sdk installation) before runtime so users get actionable hints instead of opaque crashes. Ships with human and --json output, unit tests, bats smoke tests, and a published system-requirements.md. * mod: rename qvac check-system to qvac doctor (QVAC-12239) Matches the industry-standard verb (brew doctor, flutter doctor, npm doctor) and aligns with the pitch's naming so we can grow host + installed-model checks under a single command without a follow-up breaking change. * mod: expand qvac doctor checks for correctness and completion (QVAC-12239) Addresses correctness gaps found in the initial doctor implementation. - Rename `Platform / architecture` -> `CLI host`. The CLI itself runs on desktops only; mobile platforms are SDK *deploy* targets, not CLI hosts. Surface that distinction in labels, hints, and docs. - Add a dedicated "Deploy targets (SDK)" section that reports the full SDK target matrix (DEFAULT_HOSTS in bundle-sdk/constants) split into Desktop (informational/always pass via bare-pack cross-bundling), Android (checks `adb --version`), and iOS (checks `xcodebuild -version` on macOS hosts; informational on non-macOS hosts). - Use `os.availableMemory()` (Node 22+) instead of `os.freemem()` so the RAM check doesn't produce noisy false warnings on Linux/macOS where "free" under-reports reclaimable page cache. - Fall back to POSIX `df -Pk` when `fs.statfsSync` is unavailable so the disk-space check actually runs on Node 18.0-18.14 hosts instead of silently skipping. - Resolve `@qvac/sdk` with `createRequire(projectRoot)` + `require.resolve` so hoisted installs (monorepos, Yarn/Bun workspaces) are found correctly. Use DEFAULT_SDK_NAME from bundle-sdk/constants as the single source of truth. - Introduce `info` CheckStatus + `informational` CheckSeverity for the iOS-on-non-darwin case, so it's clearly neither a warning nor a failure. - Cover every new code path in unit tests (new hoisted-SDK test, all target-check permutations, info status handling). * fix: cap doctor probeBinary at 3s to prevent hangs (QVAC-12239) probeBinary uses spawnSync with no timeout, which means a misbehaving binary on PATH (e.g. adb waiting for a device, an interactive --version that stalls on TTY detection) can hang `qvac doctor` indefinitely. Set timeout: 3000ms on the spawnSync call. 3s is generous for a --version-style invocation while keeping the command snappy. Treat SIGTERM explicitly as a failed probe so the tool is reported as "not found" rather than silently hanging. Addresses PR #1681 review comment from @NamelsKing. * mod: normalize checkTotalMemory severity to 'required' (QVAC-12239) The fail branch (< 2 GB) reported severity 'required' while the warn and pass branches reported 'recommended'. Severity is meant to describe the check itself, not the outcome of a specific branch — flipping it makes report consumers that pivot by severity (e.g. "list all required checks that passed") produce wrong results. Total RAM has a hard gate (<2 GB fails the report) with a recommended band on top, so the check is 'required' across all branches — the same pattern checkNodeVersion already uses. Also add a unit test locking this invariant across fail/warn/pass so the inconsistency cannot silently come back. Addresses PR #1681 review comment from @NamelsKing. * refactor: split doctor checks into per-section modules behind a uniform Check interface (QVAC-12239) Previously all checks lived in a single 493-line src/doctor/checks.ts with ad-hoc signatures (checkNodeVersion(version?), checkFfmpeg(probe), checkDesktopTargets(platform, arch), …). Adding a new check meant inventing yet another signature and remembering to wire it into collectCheckSections, and tests had to know which knobs each check accepted. Introduce a single abstraction: type Check = (ctx: CheckContext) => CheckResult where CheckContext carries everything a check can read (projectRoot, platform, arch, nodeVersion, totalMemoryBytes, availableMemoryBytes, probe). createDefaultContext() reads the live host once; tests build a context with exactly the overrides they need and get deterministic, host-independent runs. Files split out by report section: src/doctor/check.ts -- CheckContext, Check, probeBinary, createDefaultContext src/doctor/checks/runtime.ts -- node version, CLI host src/doctor/checks/hardware.ts -- total/available RAM, free disk src/doctor/checks/targets.ts -- desktop, android, ios src/doctor/checks/tools.ts -- ffmpeg, bare, bun src/doctor/checks/project.ts -- @qvac/sdk resolvable src/doctor/checks/index.ts -- collectCheckSections, isReportOk, barrel re-exports The orchestrator in src/doctor/index.ts is unchanged apart from the import path. Existing behaviour and JSON shape are identical; this is purely structural. Tests are reworked to go through the new interface via a small makeCtx() helper. Two new tests cover collectCheckSections({ context }) override and createDefaultContext(). 39 unit tests + 36 bats smoke tests pass. Addresses PR #1681 review comment from @lauripiisang. * feat: add GPU acceleration check to qvac doctor (QVAC-12239) QVAC inference backends use Metal on macOS (ggml-metal is hard-linked in the whispercpp addon CMakeLists) and Vulkan on Linux/Windows (the whisper-cpp vulkan feature and llama.cpp's Vulkan0/Vulkan1 device enumeration). Running LLM or Whisper inference without a GPU backend falls back to CPU, which is roughly an order of magnitude slower. `qvac doctor` previously gave users no visibility into whether GPU acceleration was actually wired up on their host, so they could spin up inference, see slow CPU-only throughput, and have no way to diagnose why. Add a platform-aware GPU check under the Hardware section: - darwin -> pass, "Metal (native macOS backend)" (Metal is always present on supported macOS versions) - linux -> probe `vulkaninfo --summary` -> pass + extracted deviceName list (e.g. "NVIDIA RTX 3080") -> warn with apt/dnf install hint when the ICD is missing - win32 -> probe `vulkaninfo --summary` -> pass + deviceName list -> warn with Vulkan SDK / GPU-driver hint when missing - other -> info, "not checked on <platform>" Severity is 'recommended' — missing GPU acceleration does not fail the report (inference still runs on CPU), but a warning is surfaced with a concrete remediation link so users can fix throughput proactively. Implementation notes: - ProbeResult gains an optional `stdout` field so probes can return full multi-line output. The GPU check parses `deviceName = ...` lines to surface which GPUs the Vulkan loader sees. - The 3s probeBinary timeout applies automatically — a stuck vulkaninfo call will not hang `qvac doctor`. - 6 new unit tests cover darwin/linux/win32/unknown-platform paths, device-name extraction, and the no-device fallback. Docs (README + system-requirements.md) updated. Addresses PR #1681 review comment from @NamelsKing.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What problem does this PR solve?
Today users only find out about missing system dependencies (wrong Node version, missing
ffmpeg, unsupported platform, too little RAM / disk,@qvac/sdknot installed, …) through runtime errors deep in the SDK. There is no upfront validator and no publishedsystem-requirements.mdto read before installing.This is the pitch item "Add a system requirements checker (e.g.
bun run check-systemor integrated into CLI) that validates system dependencies before runtime" from the SDK DX & Tooling pitch (QVAC-12239). We ship it under thedoctorverb to matchbrew doctor/flutter doctor/npm doctor, and to give us room to grow host + installed-model checks under one command without a later breaking rename.How does it solve it?
qvac doctorthat runs a preflight against the host:>= 18, warns on< 20), platform/arch against the supported matrix fromDEFAULT_HOSTS.fs.statfsSync, gracefully skipped on Node< 19.6).ffmpeg(transcription/mic), Bare runtime, Bun. All warn-only.@qvac/sdkis resolvable fromnode_modules.--jsonfor CI/scripts,--quietfor exit-code-only usage. Exit0on success,1when any required check fails.packages/cli/system-requirements.mddocumenting the full matrix and JSON schema (this closes the "No system-requirements.md file" bullet in the Asana task).@qvac/clito0.3.0(new backward-compatible feature).New API
Sample output on a supported host:
JSON shape (see
system-requirements.mdfor the full schema):How was it tested?
bun run typecheck— clean.bun run test:unit— 26/26 pass (covers all pure check functions + injected probes for missing/presentffmpeg/bare/bun, SDK present/absent on disk, threshold behavior for Node version, RAM, and the supported host matrix).bun run test:bats— 36/36 pass (34 pre-existing + 2 new: help output,--jsonvalidity).qvac doctorandqvac doctor --jsonon macOS arm64 with Node 20.19.5.Out of scope
doctorshould eventually also "validate installed models". We ship host-only checks here and leave on-disk model validation for a follow-up so this first version stays scoped. Thedoctorname intentionally leaves room for that to land without another rename.qvac bundle sdk/qvac serve openaistartup — possible follow-up; out of scope here to keep the PR small.engines.nodeabove 18 — free disk check silently skips on< 19.6.Related